Keyword-Driven Suffix Arrays for On-Line Keyword Searching from Documents In Chinese
نویسندگان
چکیده
منابع مشابه
Keyword-driven Suffix Arrays for On-line Keyword Searching from Documents in Chinese
On-line keyword searching from documents in Chinese tends to use inverted indexing as the main technique, which has its difficulties. Suffix Array is widely used for processing text in Western languages. However, it fails to get widely used in Chinese processing because of the speciality of Chinese. Suffix Array is a powerful tool. However it costs too much space. That is the major bottleneck o...
متن کاملKeyword Searching for Arabic Handwritten Documents
In this paper we present a system for searching keywords in Arabic handwritten and historical documents using two algorithms, Dynamic Time Warping (DTW) and Hidden Markov Models (HMM). The HMM based system provides satisfying results when it is possible to provide adequate training samples (which is not always possible in historical documents). The DTW algorithm with a slight modification provi...
متن کاملAutomatic keyword extraction from individual documents
Keywords, which we define as a sequence of one or more words, provide a compact representation of a document’s content. Ideally, keywords represent in condensed form the essential content of a document.
متن کاملFast keyword detection using suffix array
In this paper, we propose a technique for detecting keywords quickly from a very large speech database without using a large memory space. To accelerate searches and save memory, we used a suffix array as the data structure and applied phoneme-based DP-matching. To avoid an exponential increase in the process time with the length of the keyword, a long keyword is divided into short sub-keywords...
متن کاملKeyword Spotting Techniques for Sanskrit Documents
With advances in the field of digitization of printed documents and several mass digitization projects underway, information retrieval and document search have emerged as key research areas. However, most of the current work in these areas is limited to English and a few oriental languages. The lack of efficient solutions for Indic scripts and languages such as Sanskrit has hampered information...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: International Journal of Artificial Intelligence & Applications
سال: 2012
ISSN: 0976-2191
DOI: 10.5121/ijaia.2012.3503